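Aims of this practical

- Getting started with handling textual data in R
- Basic steps in data cleaning
- Calculating text metrics
- Replicating Zipf’s Law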

Task 1: Corpus operations

Before working with text data in more advanced ways, we start with datasets that ship with R packages and then move on to loading external datasets on which we conduct text-based operations.

R - and quanteda specifically - contains numerous “built-in” datasets. You can find these under https://quanteda.io/reference/index.html#section-data.

By loading the quanteda package, these datasets are available in your workspace and can be accessed.

library(quanteda)

# we use the dataset of inaugural speeches by US presidents as the first example
data_corpus_inaugural
Corpus consisting of 59 documents and 4 docvars.
1789-Washington :
"Fellow-Citizens of the Senate and of the House of Representa..."

1793-Washington :
"Fellow citizens, I am again called upon by the voice of my c..."

1797-Adams :
"When it was first perceived, in early times, that no middle ..."

1801-Jefferson :
"Friends and Fellow Citizens: Called upon to undertake the du..."

1805-Jefferson :
"Proceeding, fellow citizens, to that qualification which the..."

1809-Madison :
"Unwilling to depart from examples of the most revered author..."

[ reached max_ndoc ... 53 more documents ]

To access the individual texts, you can simply index the object:

us_speeches = data_corpus_inaugural
us_speeches[1]
Corpus consisting of 1 document and 4 docvars.
1789-Washington :
"Fellow-Citizens of the Senate and of the House of Representa..."

Note that this corpus object also contains docvars (document-level variables). These are essential for later analyses and classification tasks. We can see what these variables are by calling docvars(us_speeches).

See more on document variables, including how you can assign them (useful for later steps), here: https://tutorials.quanteda.io/basic-operations/corpus/docvars/. For now, it suffices to access a single docvar with the $ operator:

us_speeches$Year
 [1] 1789 1793 1797 1801 1805 1809 1813 1817 1821 1825 1829 1833 1837 1841 1845
[16] 1849 1853 1857 1861 1865 1869 1873 1877 1881 1885 1889 1893 1897 1901 1905
[31] 1909 1913 1917 1921 1925 1929 1933 1937 1941 1945 1949 1953 1957 1961 1965
[46] 1969 1973 1977 1981 1985 1989 1993 1997 2001 2005 2009 2013 2017 2021
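
Assigning a document variable works much like adding a column to a data.frame. The following is a minimal sketch, assuming a recent quanteda version in which `$<-` is defined for corpus objects; the variable name `century` is made up for this illustration.

```r
library(quanteda)

us_speeches <- data_corpus_inaugural

# list all document-level variables as a data.frame
head(docvars(us_speeches))

# add a new docvar; 'century' is a hypothetical name used only here
us_speeches$century <- floor(us_speeches$Year / 100) + 1

# the same variable can be read back via docvars()
table(docvars(us_speeches, "century"))
```

Because `us_speeches` is a copy of the built-in object, the original `data_corpus_inaugural` is left unchanged.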

Lastly, each document’s name can be accessed as:

docnames(us_speeches)
 [1] "1789-Washington" "1793-Washington" "1797-Adams"      "1801-Jefferson" 
 [5] "1805-Jefferson"  "1809-Madison"    "1813-Madison"    "1817-Monroe"    
 [9] "1821-Monroe"     "1825-Adams"      "1829-Jackson"    "1833-Jackson"   
[13] "1837-VanBuren"   "1841-Harrison"   "1845-Polk"       "1849-Taylor"    
[17] "1853-Pierce"     "1857-Buchanan"   "1861-Lincoln"    "1865-Lincoln"   
[21] "1869-Grant"      "1873-Grant"      "1877-Hayes"      "1881-Garfield"  
[25] "1885-Cleveland"  "1889-Harrison"   "1893-Cleveland"  "1897-McKinley"  
[29] "1901-McKinley"   "1905-Roosevelt"  "1909-Taft"       "1913-Wilson"    
[33] "1917-Wilson"     "1921-Harding"    "1925-Coolidge"   "1929-Hoover"    
[37] "1933-Roosevelt"  "1937-Roosevelt"  "1941-Roosevelt"  "1945-Roosevelt" 
[41] "1949-Truman"     "1953-Eisenhower" "1957-Eisenhower" "1961-Kennedy"   
[45] "1965-Johnson"    "1969-Nixon"      "1973-Nixon"      "1977-Carter"    
[49] "1981-Reagan"     "1985-Reagan"     "1989-Bush"       "1993-Clinton"   
[53] "1997-Clinton"    "2001-Bush"       "2005-Bush"       "2009-Obama"     
[57] "2013-Obama"      "2017-Trump"      "2021-Biden"     
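
Document names and docvars can also be used to pick out parts of the corpus. The sketch below assumes a quanteda version that provides `corpus_subset()` and `as.character()` for corpus objects; the year range is an arbitrary choice for illustration.

```r
library(quanteda)

us_speeches <- data_corpus_inaugural

# keep only the speeches held between 1989 and 2001 (uses the Year docvar)
speeches_90s <- corpus_subset(us_speeches, Year >= 1989 & Year <= 2001)
docnames(speeches_90s)

# as.character() returns the raw texts as a character vector named by document
first_text <- as.character(us_speeches)[1]
substr(first_text, 1, 60)
```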

Exercise 1.1

Which speech has the highest number of characters per word? And which one the lowest?

Hint: try to work with the native data.frame structure or with a data.table. This will require a conversion from the corpus object.
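
The conversion mentioned in the hint could look roughly as follows. This is a sketch on a made-up two-document corpus rather than the inaugural speeches, and it counts all characters of the raw text (including spaces and punctuation) in the numerator, which is only one of several reasonable definitions of characters per word.

```r
library(quanteda)

# toy corpus (made-up texts, not part of the exercise data)
toy <- corpus(c(d1 = "Short words win.",
                d2 = "Extraordinarily complicated vocabulary."))

# convert the corpus to a data.frame with the columns doc_id and text
toy_df <- convert(toy, to = "data.frame")

# count characters and word tokens (punctuation is not counted as a word)
toy_df$nchars <- nchar(toy_df$text)
toy_df$ntoks  <- unname(ntoken(quanteda::tokens(toy_df$text, remove_punct = TRUE)))

# characters per word, then the document with the highest value
toy_df$cpw <- toy_df$nchars / toy_df$ntoks
toy_df[which.max(toy_df$cpw), c("doc_id", "cpw")]
```

The call is written as `quanteda::tokens()` because other packages also export a function named `tokens`. Wrapping the converted data.frame in `data.table::setDT()` would give a data.table instead.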

Exercise 1.2

Which speech contained the most punctuation?
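
One possible way to count punctuation is to keep punctuation marks as tokens and then select only those. The sketch below uses made-up texts and assumes that "punctuation" means tokens matching the regular expression class `[[:punct:]]`.

```r
library(quanteda)

# made-up example texts
toy_texts <- c(a = "Hello, world! How are you?", b = "No punctuation here")

# tokenise while keeping punctuation marks as separate tokens
toks <- quanteda::tokens(toy_texts, remove_punct = FALSE)

# keep only tokens that consist entirely of punctuation characters
punct_only <- tokens_select(toks, pattern = "^[[:punct:]]+$",
                            valuetype = "regex", selection = "keep")

n_punct    <- ntoken(punct_only)       # raw count per document
prop_punct <- n_punct / ntoken(toks)   # share of all tokens per document
```

Raw counts favour long documents, so the proportion is often the more informative of the two when documents differ a lot in length.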

Exercise 1.3

How has the average sentence length changed over time?
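
Average sentence length can be approximated as words per sentence. The following sketch uses two made-up texts and relies on quanteda's sentence tokeniser, whose segmentation of real speeches may differ from a manual count.

```r
library(quanteda)

# made-up example texts
toy_texts <- c(early = "We the people meet today. We hope for peace and for unity.",
               late  = "We meet. We hope. We act.")

# number of words (without punctuation) and number of sentences per document
n_words <- ntoken(quanteda::tokens(toy_texts, what = "word", remove_punct = TRUE))
n_sent  <- ntoken(quanteda::tokens(toy_texts, what = "sentence"))

# words per sentence
wps <- n_words / n_sent
wps
```

Applied to the speeches, the resulting values can be plotted against the Year docvar to look for a trend over time.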

Task 2: First steps with real datasets

Use the dataset of statements on truthful and deceptive weekend plans that was the basis for this paper: https://www.sciencedirect.com/science/article/pii/S0001691820305746. You can find the raw textual data on the OSF: https://osf.io/rtq9y.

The participants were asked to either tell the truth about their plans for the upcoming weekend or were assigned an activity from someone else and had to lie about it (i.e., fabricate a story).

Each participant was asked to provide two statements (1. Please write about your weekend plans in as much detail as possible.; 2. Which information could prove that you are telling the truth?). Focus on the first question (called q1 in the dataset).

The variable outcome_class is either t (truthful) or d (deceptive).

Exercise 2.1

What is the effect size (Cohen’s d) for the difference in words per sentence between truthful and deceptive statements?
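
Cohen's d is the difference between two group means divided by the pooled standard deviation. The sketch below computes it by hand in base R on made-up numbers; the helper name `cohens_d_pooled` is hypothetical, and dedicated packages such as effectsize offer a ready-made `cohens_d()` that additionally reports confidence intervals.

```r
# hypothetical words-per-sentence values for two groups (made-up numbers)
wps_truthful  <- c(10, 12, 14)
wps_deceptive <- c(8, 10, 12)

cohens_d_pooled <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  # pooled standard deviation, weighting each variance by its degrees of freedom
  sd_pooled <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / sd_pooled
}

d <- cohens_d_pooled(wps_truthful, wps_deceptive)
d
```

The sign of d depends on which group is passed first, so it is worth stating the direction of the comparison when reporting the result.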

Task 3: Replicating Zipf’s Law

A curious “law” in corpus linguistics is Zipf’s Law (see this YouTube video: https://www.youtube.com/watch?v=fCn8zs912OE).

Zipf’s Law describes the relationship between the frequency of words in a language and their rank in a frequency-sorted list: the frequency of any word is inversely proportional to its rank in the frequency table.
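
The inverse proportionality can be turned into concrete predictions. The sketch below uses made-up frequency counts and shows two common checks: comparing observed frequencies with the top frequency divided by the rank, and fitting a line on a log-log scale, where a slope near -1 would be in line with Zipf's Law.

```r
# hypothetical frequency counts, already sorted from most to least frequent
freq <- c(1000, 480, 350, 240, 210, 160, 150, 120, 115, 98)
rank <- seq_along(freq)

# Zipf prediction: frequency at rank r is the top frequency divided by r
zipf_pred <- freq[1] / rank

# on a log-log scale Zipf's Law corresponds to a straight line with slope near -1
fit   <- lm(log(freq) ~ log(rank))
slope <- unname(coef(fit)[2])
slope

# observed (red) against predicted (blue) frequencies
plot(rank, freq, type = "l", col = "red", xlab = "Rank", ylab = "Frequency")
lines(rank, zipf_pred, col = "blue")
```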

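Key aspects of Zipf’s Law:

- Word frequency distribution: In a large enough collection of texts, the most common word occurs about twice as often as the second most frequent word, three times as often as the third most frequent word, etc.
- Mathematical formulation: The law can be expressed as $f(w) \propto \frac{1}{r}$, where $f(w)$ is the frequency of word $w$ and $r$ is the rank of that word.
- Universality: This distribution is observed across various languages, including children’s speech and specialized vocabularies.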

Exercise 3.1

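The dataset we will use for this exercise stems from a paper on analysing narrative shapes in YouTube vlog transcripts (https://aclanthology.org/D18-1394/). In that paper, the video transcripts of 30k vlogs were analysed. The dataset is stored as an RData file (vlogs_corpus.RData) and can be loaded with load(), which makes the object vlogs_corpus available.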

Does Zipf’s Law apply to a corpus of YouTube vlog transcripts?

Hint: for this analysis you will need the most common words in that corpus. Have a look at the topfeatures() function. It expects a dfm (document-feature matrix), so first convert your tokenised object into a dfm (we will learn more about the dfm in the next part).
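
The steps in the hint can be sketched on a toy example. The texts below are made up and far too small to say anything about Zipf's Law; they only illustrate the flow from tokens to a dfm to the most frequent features.

```r
library(quanteda)

# made-up example texts standing in for vlog transcripts
toy_texts <- c(v1 = "the cat and the dog",
               v2 = "the bird and the fish and the frog")

# tokenise, build a document-feature matrix, then take the most frequent features
toks    <- quanteda::tokens(toy_texts, remove_punct = TRUE)
toy_dfm <- dfm(toks)
top     <- topfeatures(toy_dfm, n = 3)
top

# the names are the words, the values their frequencies across all documents
names(top)
```

For the exercise, a larger `n` (for example 100) gives enough ranks to compare the observed frequencies with the Zipf prediction.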

Exercise 3.2

How do the word frequency ranks in the vlogs corpus deviate from Google’s 1 Trillion Word Corpus frequency ranks?

You can find a ranked list of word frequencies from Google’s Trillion Word Corpus at: https://github.com/first20hours/google-10000-english. It is also provided in the data directory of this repo (./data/google_10k_list.txt). These data are already in ranked order; the file does not contain a header (so set: header=F).
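
One way to compare the two rankings is to join them on the word itself and look at the rank difference. The sketch below is written in base R with made-up stand-ins for both lists; the five "Google" words and the vlog order are illustrative assumptions, not the real data.

```r
# stand-in for the Google list: one word per line, no header (made-up excerpt)
google_ranks <- read.table(text = "the\nof\nand\nto\na", header = FALSE,
                           col.names = "word", stringsAsFactors = FALSE)
google_ranks$rank_google <- seq_len(nrow(google_ranks))

# stand-in for the vlog top words, e.g. the names returned by topfeatures()
vlog_ranks <- data.frame(word = c("and", "the", "to", "i", "a"),
                         rank_vlog = 1:5, stringsAsFactors = FALSE)

# join on the word so that each word's two ranks sit side by side
ranks_merged <- merge(vlog_ranks, google_ranks, by = "word")

# positive values: the word ranks lower (is less prominent) in the vlogs
ranks_merged$rank_diff <- ranks_merged$rank_vlog - ranks_merged$rank_google
ranks_merged[order(ranks_merged$rank_vlog), ]
```

Words that appear in only one list are dropped by this join; passing `all = TRUE` to `merge()` would keep them with missing ranks instead.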

